This project analyzes and visualizes several aspects of living costs across countries, using data from the website numbeo.com. The following points will be explored and visualized:

  1. Average salary: An interactive map will be created to show the average salary of each country. The map will use a color scale to differentiate countries by average salary.

  2. Rice affordability: The amount of rice that can be bought with the average monthly salary of each country will be calculated and visualized as a chart showing the kilograms of rice that one salary buys.

  3. Saving for a flat: The amount of time it takes to save for a 72 square meter flat will be calculated, taking into account that 20% of the monthly salary is saved. The top and bottom ten countries with the shortest and longest savings periods will be identified and visualized.

  4. Rent affordability: The percentage of salary that needs to be spent on rent per month will be calculated and visualized. The visualization will be in the form of a graph, showing the percentage of the salary spent on rent for each country.

Each of these points will be visualized separately using various charts and graphs, enabling a better understanding of the living costs in different countries. The findings will be valuable for people who plan to live or work in a foreign country and want to get an idea of the living costs beforehand.

We will be using data from the website https://www.numbeo.com/cost-of-living/ which was scraped on February 2, 2023.

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
import folium
import geopandas as gpd
from IPython.display import display, HTML, IFrame
import requests
In [3]:
file = r'C:\Users\rocze\Desktop\pythonProject\COL.json'
In [4]:
data = pd.read_json(file)

Prior to visualizing the data, we need to clean and organize it through data wrangling.

In [5]:
selectedData = data[['CountryName','Coke/Pepsi (0.33 liter bottle)','Water (0.33 liter bottle)','Rice (white), (1kg)','1 Pair of Nike Running Shoes (Mid-Range)','Apartment (1 bedroom) in City Centre','Apartment (3 bedrooms) in City Centre','Price per Square Meter to Buy Apartment in City Centre','Average Monthly Net Salary (After Tax)']]
In [6]:
selectedData = selectedData.applymap(lambda x: x.replace('\u00a0$', '') if isinstance(x, str) else x)
selectedData = selectedData.applymap(lambda x: x.replace(',', '') if isinstance(x, str) else x)
selectedData.iloc[:, 1:] = selectedData.iloc[:, 1:].apply(lambda x: pd.to_numeric(x, errors='coerce'))
In [7]:
# Cast every price column to float (the iloc assignment above can leave them as object dtype).
priceColumns = selectedData.columns[1:]
selectedData[priceColumns] = selectedData[priceColumns].astype(float)
In [8]:
selectedData.columns=['Country','Coke 0.33l','Water 0.33l','Rice 1kg','Nike Runners','1B Apartment','3B Apartment','1SM Price in Center','Avg Salary']
In [9]:
# Rename countries to the spellings used in the rest of the notebook.
# Caution: the Republic of the Congo and the Democratic Republic of the Congo are
# different countries, so this particular mapping should be double-checked.
countryRenames = {
    "United States": "United States of America",
    "Czech Republic": "Czechia",
    "Kosovo (Disputed Territory)": "Kosovo",
    "Republic Of Congo": "Dem. Rep. Congo",
    "South Sudan": "S. Sudan",
    "Bosnia And Herzegovina": "Bosnia and Herz.",
}
selectedData['Country'] = selectedData['Country'].replace(countryRenames)
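
The cleaning steps above can be tried in isolation on a small frame. The following is only a sketch: the two rows and the helper name `clean_prices` are made up for illustration and are not part of the scraped data or the original notebook.

```python
import pandas as pd

# Toy rows mimicking the scraped format; the values are invented for illustration.
raw = pd.DataFrame({
    "CountryName": ["Atlantis", "Elbonia"],
    "Rice (white), (1kg)": ["1.50\u00a0$", "?"],
    "Average Monthly Net Salary (After Tax)": ["2,350.75\u00a0$", "410.00\u00a0$"],
})

def clean_prices(df, text_columns=("CountryName",)):
    """Strip the currency suffix and thousands separators, then convert to float."""
    cleaned = df.copy()
    for col in cleaned.columns:
        if col in text_columns:
            continue
        as_text = (
            cleaned[col]
            .astype(str)
            .str.replace("\u00a0$", "", regex=False)
            .str.replace(",", "", regex=False)
        )
        # Anything that still is not a number (e.g. "?") becomes NaN.
        cleaned[col] = pd.to_numeric(as_text, errors="coerce")
    return cleaned

tidy = clean_prices(raw)
print(tidy.dtypes)
```

Working on a copy keeps the raw frame untouched, which makes it easy to compare values before and after cleaning.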

1. Average salary map:

We will be creating a map that displays the average monthly salary in USD for each country. The map will allow us to visualize the disparities in salaries across different regions of the world and highlight the countries with the highest and lowest average salaries. By using color-coding and shading, the map will enable us to identify patterns and trends in the data and provide valuable insights into the economic conditions of different countries.

In [10]:
selectedDataASM = selectedData.copy()
In [11]:
fig = px.choropleth(selectedDataASM, locations="Country", locationmode="country names",
                    color="Avg Salary",
                    hover_name="Country",
                    color_continuous_scale=px.colors.sequential.Bluyl,
                    range_color=(selectedDataASM["Avg Salary"].min(), selectedDataASM["Avg Salary"].max()),
                    labels={"Avg Salary": "USD"})
fig.update_layout(showlegend=False)
fig.update_layout(title="Average monthly salary (USD)", title_x=0.5,width=900, height=700)
fig.show()

Map observation

  1. The map suggests that North America and Europe have the highest average salaries, while Oceania also looks competitive. Africa, on the other hand, seems to have the lowest incomes. To validate these observations, we will create a boxplot.
In [12]:
europe = ["Albania", "Andorra", "Austria", "Belarus", "Belgium", "Bosnia and Herz.", "Bulgaria", "Croatia", "Cyprus", "Czechia", "Denmark", "Estonia", "Finland", "France", "Germany", "Greece", "Hungary", "Iceland", "Ireland", "Italy", "Kosovo", "Latvia", "Liechtenstein", "Lithuania", "Luxembourg", "Malta", "Moldova", "Monaco", "Montenegro", "Netherlands", "North Macedonia", "Norway", "Poland", "Portugal", "Romania", "Russia", "San Marino", "Serbia", "Slovakia", "Slovenia", "Spain", "Sweden", "Switzerland", "Ukraine", "United Kingdom", "Vatican City"]
asia = ['Afghanistan', 'Bahrain', 'Bangladesh', 'Bhutan', 'Brunei', 'Cambodia', 'China', 'Cyprus', 'East Timor', 'India', 'Indonesia', 'Iran', 'Iraq', 'Israel', 'Japan', 'Jordan', 'Kazakhstan', 'Kuwait', 'Kyrgyzstan', 'Laos', 'Lebanon', 'Malaysia', 'Maldives', 'Mongolia', 'Myanmar', 'Nepal', 'North Korea', 'Oman', 'Pakistan', 'Palestine', 'Philippines', 'Qatar', 'Russia', 'Saudi Arabia', 'Singapore', 'South Korea', 'Sri Lanka', 'Syria', 'Taiwan', 'Tajikistan', 'Thailand', 'Turkey', 'Turkmenistan', 'United Arab Emirates', 'Uzbekistan', 'Vietnam', 'Yemen']
africa = ["Algeria", "Angola", "Benin", "Botswana", "Burkina Faso", "Burundi", "Cameroon", "Cape Verde", "Central African Republic", "Chad", "Comoros", "Democratic Republic of the Congo", "Republic of the Congo", "Djibouti", "Egypt", "Equatorial Guinea", "Eritrea", "Eswatini (formerly Swaziland)", "Ethiopia", "Gabon", "Gambia", "Ghana", "Guinea", "Guinea-Bissau", "Ivory Coast (Cote d'Ivoire)", "Kenya", "Lesotho", "Liberia", "Libya", "Madagascar", "Malawi", "Mali", "Mauritania", "Mauritius", "Morocco", "Mozambique", "Namibia", "Niger", "Nigeria", "Rwanda", "Sao Tome and Principe", "Senegal", "Seychelles", "Sierra Leone", "Somalia", "South Africa", "South Sudan", "Sudan", "Tanzania", "Togo", "Tunisia", "Uganda", "Zambia", "Zimbabwe"]
northAmerica = ["Canada", "Mexico", "United States of America"]
southAmerica = ['Argentina', 'Bolivia', 'Brazil', 'Chile', 'Colombia', 'Ecuador', 'Guyana', 'Paraguay', 'Peru', 'Suriname', 'Uruguay', 'Venezuela']
oceania = ["Australia", "Fiji", "Kiribati", "Marshall Islands", "Micronesia", "Nauru", "New Zealand", "Palau", "Papua New Guinea", "Samoa", "Solomon Islands", "Tonga", "Tuvalu", "Vanuatu"]
In [13]:
continents = {
    'europe': europe,
    'asia': asia,
    'africa': africa,
    'northAmerica': northAmerica,
    'southAmerica': southAmerica,
    'oceania': oceania
}
selectedDataASM['Continent'] = selectedDataASM['Country'].apply(lambda x: next((k for k, v in continents.items() if x in v), None))
In [14]:
selectedDataASM = selectedDataASM[selectedDataASM['Continent'].notna()]
sns.set(style="ticks")
sns.set(rc={'figure.figsize':(12,8)})
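continentOrder = ['asia', 'europe', 'africa', 'southAmerica', 'oceania', 'northAmerica']
ax = sns.boxplot(x="Continent", y="Avg Salary", data=selectedDataASM, order=continentOrder, showfliers=True)
sns.stripplot(x="Continent", y="Avg Salary", data=selectedDataASM, order=continentOrder, jitter=True, color='black', alpha=0.1)
# The labels follow continentOrder, so each tick is guaranteed to match its box.
ax.set_xticklabels(["Asia", "Europe", "Africa", "South America","Oceania","North America"])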
ax.set_xlabel("Continent")
ax.set_ylabel("Monthly USD Salary")
ax.set_title("Boxplot of monthly salary per continent")
sns.despine()
plt.show()

Based on the boxplot, the following observations can be made:

  1. The highest individual country salaries are found in Oceania and Europe, as the maximum values show.
  2. North America has the highest median salary among all the regions.
  3. In Asia, the salary distribution is the most diverse, with a mix of high and very low values, and the median salary is relatively low.
  4. South America and Africa have relatively low salaries, but in Africa, some countries have higher incomes compared to South America.
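
Observations like these can also be checked numerically rather than read off the plot. A minimal sketch, assuming a frame with the same `Continent` and `Avg Salary` columns; the salaries below are hypothetical and only show the mechanics:

```python
import pandas as pd

# Hypothetical salaries, only to show the mechanics; not taken from the dataset.
toy = pd.DataFrame({
    "Continent": ["europe", "europe", "europe", "africa", "africa", "oceania"],
    "Avg Salary": [1000.0, 3000.0, 6500.0, 200.0, 600.0, 3500.0],
})

# One row per continent: number of countries, median, mean and maximum salary.
summary = (
    toy.groupby("Continent")["Avg Salary"]
    .agg(["count", "median", "mean", "max"])
    .sort_values("median", ascending=False)
)
print(summary)
```

Listing the median next to the mean matters here, because a boxplot draws the median and a few very high salaries can pull the mean well above it.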

2. Rice affordability:

Rice affordability refers to the ability of people in a given country to purchase a certain quantity of rice using their average monthly salary. This measure is useful for understanding the purchasing power of people in different countries and regions, as well as for identifying disparities in wealth and income. By calculating the amount of rice that can be bought with the average salary in each country, we can gain insights into the relative affordability of this staple food item. Visualizing this data through a chart or graph allows us to compare and contrast the rice affordability of different countries, helping us to identify trends and patterns that may be of interest to researchers, policymakers, and others.

In [15]:
selectedDataRA = selectedData.copy()
selectedDataRA = selectedDataRA.dropna(subset=['Rice 1kg', 'Avg Salary'])
selectedDataRA['Rice per Salary'] = selectedDataRA['Avg Salary'] / selectedDataRA['Rice 1kg']
In [16]:
sns.set_style('whitegrid')
# No hue variable is used, so a palette argument would have no effect and is omitted.
sns.lmplot(x='Rice per Salary', y='Avg Salary', data=selectedDataRA)
In [17]:
correlation = selectedDataRA['Rice per Salary'].corr(selectedDataRA['Avg Salary'])
print(round(correlation, 3))
0.789

The correlation score of 0.789 between "Average salary" and "Rice per salary" suggests a strong positive linear relationship between the two variables: as the average salary increases, the amount of rice it buys also increases. In other words, people in countries with higher average salaries can afford more rice than those in countries with lower salaries. Part of this relationship is expected by construction, because "Rice per salary" is computed by dividing the average salary by the price of rice.

The strength of this correlation score suggests that "Rice per salary" can be a good indicator of the affordability of rice in different countries based on the average salary of its citizens. This information could be useful for policy makers, businesses, and individuals in understanding the relationship between income and food affordability in different regions.
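
Because "Rice per salary" is derived from the salary itself, it can help to look at more than one coefficient. The sketch below uses made-up numbers and compares the Pearson coefficient with a rank-based (Spearman-style) one, computed here as the Pearson correlation of the ranks, and with the correlation between the rice price and the salary:

```python
import pandas as pd

# Made-up numbers for illustration only.
toy = pd.DataFrame({
    "Avg Salary": [300.0, 800.0, 1500.0, 3000.0, 6000.0],
    "Rice 1kg": [1.0, 1.6, 1.5, 2.5, 3.0],
})
toy["Rice per Salary"] = toy["Avg Salary"] / toy["Rice 1kg"]

# Linear (Pearson) correlation, as used in the notebook.
pearson = toy["Rice per Salary"].corr(toy["Avg Salary"])

# Rank correlation: Pearson correlation of the ranks of both columns.
spearman = toy["Rice per Salary"].rank().corr(toy["Avg Salary"].rank())

# How strongly the rice price itself moves with the salary.
price_vs_salary = toy["Rice 1kg"].corr(toy["Avg Salary"])

print(round(pearson, 3), round(spearman, 3), round(price_vs_salary, 3))
```

If rice prices rose as fast as salaries, the ratio would stay flat and the first two coefficients would drop, so the third number gives some context for interpreting them.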

In [20]:
selectedDataRA = selectedDataRA.sort_values(by=['Rice per Salary'])
topFive = selectedDataRA.tail(5)
bottomFive = selectedDataRA.head(5)
figTop = px.treemap(topFive, 
                     path=['Country'], 
                     values='Rice per Salary', 
                     color='Rice per Salary',
                     color_continuous_scale='YlGnBu', 
                     title='Top Five Countries by Rice Affordability',
                     width=800,
                     height=500)
figTop.show()
figBottom = px.treemap(bottomFive, 
                        path=['Country'], 
                        values='Rice per Salary', 
                        color='Rice per Salary',
                        color_continuous_scale='YlGnBu', 
                        title='Bottom Five Countries by Rice Affordability',
                        width=800,
                        height=500)
figBottom.show()

Based on the two treemaps, it is evident that Tuvalu is the country where the highest amount of rice can be purchased, while Cuba has the lowest number of kilograms. The difference between the two countries is more than 5 tons of rice, highlighting the vast disparity in living conditions across the globe.

In conclusion, the amount of salary is a critical factor in determining how much rice one can afford per month, as confirmed by our correlation coefficient and positive regression plot. This is further exemplified by countries with low average salaries, such as Cuba and Syria, where individuals are unable to afford as much rice as those in other countries. On the other hand, countries like Tuvalu and Kiribati with high average salaries show how higher income can lead to greater comfort in terms of purchasing the basic necessity of food.

3. Saving for a 72 square meter flat in the city center

In this study, we will analyze the duration required to save money for purchasing a 72 square meter flat based on the provided monthly salary. We will consider a savings rate of 20% of the salary.
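
The calculation described above boils down to one formula: the flat price (price per square meter times 72) divided by the yearly savings (20% of the monthly salary times 12). A small sketch with a hypothetical helper `years_to_save` and made-up inputs:

```python
def years_to_save(price_per_sqm, monthly_salary, area_sqm=72, savings_rate=0.20):
    """Years needed to save the full flat price at a constant savings rate."""
    flat_price = price_per_sqm * area_sqm
    yearly_savings = monthly_salary * savings_rate * 12
    return flat_price / yearly_savings

# Made-up example: 2,000 USD per square meter and a 3,000 USD monthly salary.
example_years = round(years_to_save(2000, 3000), 2)
print(example_years)  # 20.0
```

The sketch ignores interest, inflation, and salary growth, in line with the simple assumption of a fixed 20% savings rate.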

In [21]:
selectedDataASM = selectedDataASM.dropna()
selectedDataASM = selectedDataASM[~selectedDataASM['Country'].isin(['Kiribati', 'Niger'])]
selectedDataASM['72mFlat'] = selectedDataASM['1SM Price in Center']*72
selectedDataASM['Years Needed'] = round((selectedDataASM['72mFlat'] / (selectedDataASM['Avg Salary']*0.2*12)), 2)
selectedDataASM = selectedDataASM.sort_values(by=['Years Needed'])
In [25]:
selectedDataASM = selectedDataASM[selectedDataASM['Continent'].notna()]
selectedDataASM = selectedDataASM[selectedDataASM['Years Needed'].notna()]
sns.set(style="ticks")
sns.set(rc={'figure.figsize':(12,8)})
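continentOrder = ['asia', 'europe', 'africa', 'southAmerica', 'oceania', 'northAmerica']
ax = sns.boxplot(x="Continent", y="Years Needed", data=selectedDataASM, order=continentOrder, showfliers=False, boxprops=dict(alpha=.8))
# The labels follow continentOrder, so each tick is guaranteed to match its box.
ax.set_xticklabels(["Asia", "Europe", "Africa", "South America","Oceania","North America"])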
ax.set_xlabel("Continent")
ax.set_ylabel("Years Needed to buy a flat in city center")
sns.despine()
plt.show()

Based on the graph, the region with the longest typical time needed to buy a flat in the city center is Oceania, followed by Asia. The situation is better in Europe and North America, where the times are shorter. Africa has the shortest typical time among all the continents. This can be attributed to the fact that, although salaries there are relatively low, the price per square meter is also low, which makes a flat in the city center easier to afford.

Let's examine in which countries a flat can be bought within a working lifetime. For this case study, we assume that people have 60 years of working time to save for a flat.

In [48]:
selectedDataASM = selectedDataASM[selectedDataASM['Continent'].notna()]
selectedDataASM = selectedDataASM[selectedDataASM['Years Needed'].notna()]
selectedDataASM = selectedDataASM[selectedDataASM['Years Needed'] < 60]
fig = px.sunburst(selectedDataASM, path=['Continent', 'Country'], values='Years Needed', 
                  color='Years Needed', color_continuous_scale=px.colors.sequential.Blues_r, 
                  hover_name='Country', title='Countries where it is possible to purchase a flat within a 60-year working lifetime')
fig.update_layout(
    margin=dict(l=0, r=0, t=50, b=0),
    title_font=dict(size=20),
    coloraxis_colorbar=dict(title="Years"),
    width=1000,
    height=600)
fig.show()

According to the chart, the countries where one can buy a flat in the shortest amount of time are Liechtenstein, Burkina Faso, and Saudi Arabia. Conversely, it takes nearly 60 years to buy a flat in Liberia, Norway, and Australia.

4. Rent affordability of apartment in city center

In this case study, we will examine the percentage of the average monthly salary that goes towards renting a one-bedroom or a three-bedroom apartment.

In [28]:
selectedDataRA = selectedData.copy()
selectedDataRA = selectedDataRA.dropna()
In [47]:
selectedDataRA['Rent Percentage 1B'] = selectedDataRA['1B Apartment'] / selectedDataRA['Avg Salary'] * 100
fig = px.choropleth(selectedDataRA, 
                    locations='Country', 
                    locationmode='country names',
                    color='Rent Percentage 1B',
                    color_continuous_scale='Greens',
                    range_color=(0, 200),
                    labels={'Rent Percentage 1B': 'Rent as % of monthly salary'},
                    hover_name='Country',
                    title='One-bedroom apartment rent (USD) as a percentage of monthly salary by country',
                    template='plotly_white')
fig.update_layout(
    width=900,
    height=600)
fig.show()
In [46]:
selectedDataRA['Rent Percentage 3B'] = selectedDataRA['3B Apartment'] / selectedDataRA['Avg Salary'] * 100
fig = px.choropleth(selectedDataRA, 
                    locations='Country', 
                    locationmode='country names',
                    color='Rent Percentage 3B',
                    color_continuous_scale='Greens',
                    range_color=(0, 300),
                    labels={'Rent Percentage 3B': 'Rent as % of monthly salary'},
                    hover_name='Country',
                    title='Three-bedroom apartment rent (USD) as a percentage of monthly salary by country',
                    template='plotly_white')
fig.update_layout(
    width=900,
    height=600)
fig.show()

From the two maps, it is apparent that there are several countries where the average salary does not cover the rent of an apartment in the city center. In the next step, we will determine the exact number of such countries.

In [45]:
rentBelow100 = selectedDataRA[selectedDataRA['Rent Percentage 1B'] < 100]['Country'].count()
totalCountries = selectedDataRA['Country'].nunique()
rentBelow100Pct = rentBelow100 / totalCountries * 100
fig = px.pie(
    values=[rentBelow100, totalCountries - rentBelow100],
    names=['Rent below 100%', 'Rent 100% or above'],
    title=f"Countries where rent of one-bedroom apartment is below 100%: {rentBelow100} ({rentBelow100Pct:.1f}%) of {totalCountries} total countries")
fig.update_layout(
    width=1000,
    height=800,)
fig.show()
In [44]:
rentBelow100 = selectedDataRA[selectedDataRA['Rent Percentage 3B'] < 100]['Country'].count()
totalCountries = selectedDataRA['Country'].nunique()
rentBelow100Pct = rentBelow100 / totalCountries * 100
fig = px.pie(
    values=[rentBelow100, totalCountries - rentBelow100],
    names=['Rent below 100%', 'Rent 100% or above'],
    title=f"Countries where rent of three-bedroom apartment is below 100%: {rentBelow100} ({rentBelow100Pct:.1f}%) of {totalCountries} total countries")
fig.update_layout(
    width=1000,
    height=800,)
fig.show()

The data indicate that in 77.8% of countries the average monthly salary covers the rent of a one-bedroom apartment in the city center, while for a three-bedroom apartment this holds in only 60.6% of countries. Covering the rent is a weak test, since it leaves nothing for other expenses. Even so, in a sizeable share of countries renting in the city center is out of reach on an average salary, which leads people to seek housing outside of these areas.
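
The two pie-chart cells repeat the same counting logic for the one-bedroom and three-bedroom columns. One possible way to avoid the duplication is a small helper; the name `share_below` and the toy percentages are hypothetical and only illustrate the idea:

```python
import pandas as pd

def share_below(df, column, threshold=100.0):
    """Return (count, total, percent) of rows whose `column` is below `threshold`."""
    total = len(df)
    count = int((df[column] < threshold).sum())
    percent = count / total * 100 if total else 0.0
    return count, total, percent

# Toy rent percentages, invented for illustration.
toy = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "Rent Percentage 1B": [35.0, 80.0, 120.0, 99.9],
    "Rent Percentage 3B": [60.0, 150.0, 210.0, 170.0],
})

one_bed = share_below(toy, "Rent Percentage 1B")
three_bed = share_below(toy, "Rent Percentage 3B")
print(one_bed, three_bed)
```

Keeping the threshold as a parameter also makes it easy to repeat the count with a stricter cut-off than 100% of the salary.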

Conclusion:

Based on the analyzed data, more developed countries in Europe, North America, and parts of Asia offer more favorable living conditions than less developed countries, particularly in other parts of Asia and in Africa. This highlights the ongoing need to strive for equal opportunities and living conditions across the world.